Swaps in protein sequences.
Abstract
An important question in protein evolution is to what extent proteins may have undergone swaps (switches of domain or fragment order) during evolution. Such events might have occurred in several forms: swaps of short fragments, swaps of structural and functional motifs, or recombination of domains in multidomain proteins. This question is important for the theoretical understanding of protein evolution, and it has practical implications for using swaps as a design tool in protein engineering. To analyze the question systematically, we conducted a large-scale survey of possible swaps and permutations among all pairs of proteins from the Swissprot database. A swap is defined as a specific kind of sequence mutation between two proteins in which two fragments that appear in both sequences have a different relative order in the two sequences. For example, aXbYc and dYeXf are defined as a swap, where X and Y represent sequence fragments that switched their order. Identifying such swaps is difficult using standard sequence comparison packages. One of the main problems in the analysis stems from the fact that many sequences contain repeats, which may be identified as false-positive swaps. We used two different approaches to detect pairs of proteins with swaps. The first approach is based on the predefined list of domains in Pfam. We identified all the proteins that share at least two domains and analyzed their relative order, looking for pairs in which the order of these domains was switched. We designed an algorithm to distinguish between real swaps and duplications. In the second approach, we used BLAST to detect pairs of proteins that share several fragments. Then, we used an automatic procedure to select pairs that are likely to contain swaps. Those pairs were analyzed visually, using a graphical tool, to eliminate duplications.
Combining these approaches, about 140 different cases of swaps in the Swissprot database were found (after eliminating multiple pairs within the same family). Some of the cases have been described in the literature, but many are novel examples. Although each new example identified may be interesting to analyze, our main conclusion is that cases of swaps are rare in protein evolution. This observation is at odds with the common view that proteins are very modular to the point that modules (e.g., domains) can be shuffled between proteins with minimal constraints. Our study suggests that sequential constraints, i.e., the relative order between domains, are highly conserved.
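The domain-order comparison behind the first approach can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `find_swaps` and its list-of-domain-names input are hypothetical, and the rule of skipping any domain that occurs more than once in either protein is only a conservative stand-in for the step that separates real swaps from duplications.

```python
from itertools import combinations


def find_swaps(domains_a, domains_b):
    """Return pairs (X, Y) whose relative order differs between two proteins.

    Each argument lists one protein's domains from N- to C-terminus.
    Assumption (not from the paper): a domain repeated within either
    protein is ignored, so repeats cannot be reported as swaps.
    """
    def unique_positions(domains):
        # Map each domain name to its index, keeping only single-copy domains.
        positions = {}
        for index, name in enumerate(domains):
            positions.setdefault(name, []).append(index)
        return {name: idx[0] for name, idx in positions.items() if len(idx) == 1}

    pos_a = unique_positions(domains_a)
    pos_b = unique_positions(domains_b)
    shared = sorted(pos_a.keys() & pos_b.keys())

    swaps = []
    for x, y in combinations(shared, 2):
        # A swap: x precedes y in one protein but follows it in the other.
        if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y]):
            swaps.append((x, y))
    return swaps


# The example from the abstract, with each letter treated as one fragment.
print(find_swaps(list("aXbYc"), list("dYeXf")))  # [('X', 'Y')]
```

Treating repeated domains as uninformative is the simplest way to avoid the false positives caused by repeats; it will also miss genuine swaps that involve a duplicated domain, which is one reason a dedicated procedure is needed in practice.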
Journal: Proteins
Volume 48, Issue 2
Publication date: 2002